Much effort has been expended in the field of analytical chemistry toward the development
of  selective  sensors.  The  ultimate  goal  of  this  area  of  research  is  to  build  sensors  that  respond  to 
only  one  analyte  while  ignoring  all  other  analytes  (interferents)  that  are  present  in  the  samples. 
Perhaps  the  most  common  example  of  the  result  of  this  effort  is  the  development  of  ion  selective 
electrodes  (ISE)  for  the  determination  of  ion  concentrations  in  solutions.  While  some  ISEs  are 
relatively  selective  for  the  desired  ions,  all  suffer  from  some  degree  of  non-specificity. 
Unfortunately,  in  the  field  of  sensor  development  this  is  a  common  occurrence. 

Another  approach  to  solving  the  problem  of  interferents  is  to  use  multiple  non-selective 
sensors and employ multivariate mathematics (1) to perform the calibration and prediction. This
was the approach taken in two recent papers (2,3) where arrays of ISEs were used to quantify
mixtures of analytes. Analyte quantitation was achieved using either linear (2) or non-linear (3)
regression  techniques  to  model  the  response  of  the  electrodes  to  the  concentration  of  analytes  in 
mixture samples. In both papers, the sensor responses were assumed to obey the relationship
found in the set of extended Nernst equations,

E_{ij} = E_j^{\circ} + S_j \log\Big( a_{ij} + \sum_{l} K_{jl}\, a_{il} \Big)    (1)


where E_ij is the potential of the jth electrode in the array measured for the ith sample with respect
to a suitable reference electrode; E_j° is the intercept potential of the jth electrode; S_j is the slope of
the response of the electrode in the absence of any interferents (analytes to which the sensor
responds but for which it was not designed); a_ij is the activity of the analyte for which the electrode was
made; a_il is the activity of the lth interfering ion; and K_jl is the selectivity coefficient of the jth
electrode with respect to the lth interfering ion. In this study, only two analytes were present
(sodium and potassium), and therefore equation 1 can be rewritten in the following form,


E_{ij} = E_j^{\circ} + S_j \log( c_{i,\mathrm{Na}^+} + K_j\, c_{i,\mathrm{K}^+} )    (2)

where the order of the analytes has been arbitrarily chosen and the activities have been replaced by
concentrations (c_iNa+, c_iK+).
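
As a minimal sketch of equation 2, the function below evaluates the two-ion response of a single electrode. It assumes base-10 logarithms; the function name and the parameter values in the usage lines are hypothetical and are not taken from the study.

```python
import math

def nernst_response(e0, slope, k_sel, c_na, c_k):
    """Potential of one electrode for one sample (equation 2).

    e0        : intercept potential of the electrode
    slope     : response slope of the electrode
    k_sel     : selectivity coefficient for potassium
    c_na, c_k : sodium and potassium concentrations (same units)
    Assumes the logarithm is base 10.
    """
    return e0 + slope * math.log10(c_na + k_sel * c_k)

# Hypothetical parameters, for illustration only.
e_no_k = nernst_response(e0=100.0, slope=59.0, k_sel=0.5, c_na=0.1, c_k=0.0)
e_with_k = nernst_response(e0=100.0, slope=59.0, k_sel=0.5, c_na=0.1, c_k=0.2)
```

With a nonzero selectivity coefficient, adding the interfering ion raises the measured potential even though the sodium concentration is unchanged, which is the non-specificity described above.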

Otto  and  Thomas  (2)  used  calibration  samples  containing  only  one  analyte  to  determine  the 
slopes  (Sj)  for  each  electrode,  rearranged  equation  2,  and  used  multiple  linear  regression  (4)  and 
partial  least  squares  (5)  to  determine  the  remaining  two  parameters  for  the  model  (Ej°  and  Kj). 
Beebe  et  al.  (3)  used  non-linear  regression  based  on  a  simplex  algorithm  and  multiple  linear 
regression  (MLR)  to  determine  the  model  parameters  with  no  a  priori  information  concerning  the 
slope  of  the  electrode  responses. 

This  study  will  use  a  relatively  new  method  of  analysis  called  projection  pursuit  (6)  to 
determine  the  model  parameters.  Projection  pursuit  is  a  nonparametric  multivariate  technique  that 
allows  the  analyst  to  calibrate  a  system  with  no  a  priori  information  about  the  functional  form  of  the 
calibration  model.  In  other  words,  given  the  responses  of  J  sensors  to  I  samples  containing 
mixtures  of  K  analytes,  projection  pursuit  can  find  an  appropriate  form  for  the  model  (log, 
parabolic,  linear,  etc.)  as  well  as  the  model  parameters.  In  the  more  common  calibration 
procedures,  knowledge  of  the  functional  form  is  an  essential  component.  For  example,  in  building 
the  calibration  model  for  an  experiment  involving  absorption  measurements,  it  is  common  practice 
to  assume  the  instrument  response  follows  Beer's  Law  and  use  a  regression  procedure  to  build  the 
model.  If  the  linearity  criterion  is  not  obeyed,  the  derived  models  are  not  valid,  and  the  true 
models  cannot  be  obtained  using  the  normal  linear  regression  techniques.  Similarly,  previous 
papers  treating  the  calibration  of  arrays  of  ISEs  (2,3)  based  the  calibration  models  on  the 
assumption  that  the  electrodes  obey  a  known  response  equation.  Not  knowing  this  "functional 
form"  can  make  the  calibration  quite  inaccurate.  Furthermore,  unexpected  departure  from  the 
assumed  functional  form  can  yield  erroneous  results.  For  these  reasons,  non-parametric  methods 
in  general  and  projection  pursuit  in  particular  are  powerful  and  versatile  tools  with  a  wide  variety  of 
possible  applications. 


Theory 


Notation  Throughout  this  paper  matrices  will  be  represented  with  bold  uppercase  letters 
(R);  vectors  with  bold  lower  case  letters  such  as  rj  to  signify  the  jth  column  of  R;  and  scalar 
quantities  with  plain  upper  and  lower  case  letters  (I,  i). 

Projection  Pursuit  In  general,  the  goal  of  all  multivariate  calibration  procedures  is  to 
estimate  model  parameters  relating  an  IxJ  matrix  R  containing  the  responses  of  J  sensors  or 
wavelengths  to  I  calibration  samples  to  an  IxK  matrix  C  containing  the  concentrations  or 
characteristics of K analytes in the same I samples. Once estimates of the model parameters are
obtained,  it  is  possible  to  predict  the  concentration  of  analytes  in  a  new  sample  of  unknown 
concentrations. 

If  the  form  of  the  model  relating  R  to  C  is  unknown,  often  the  analyst  "assumes"  linearity 
and  hopes  for  success.  Another  approach  is  to  guess  at  a  functional  form  and  test  its 
reasonableness  with  the  calibration  data.  The  problem  with  the  latter  approach  is  that  there  are  too 
many  functional  forms  from  which  to  choose.  There  is  literally  an  infinite  number  of  models  that 
can  be  constructed  even  for  the  simplest  case  of  one  response  vector  and  one  concentration  vector. 

Projection  pursuit  limits  the  number  of  choices  by  allowing  the  calibration  data  to  determine 
an  appropriate  model.  The  procedure  projects  the  K  dimensional  calibration  data  (the  K  columns 
of  C)  into  a  smaller  space  while  retaining  the  multivariate  structure.  In  other  words,  it  determines 
the  linear  combination  of  predictor  variables  that  is  "best"  related  to  the  columns  of  R.  As  in  any 
calibration  procedure,  the  analyst  must  decide  on  which  of  the  calibration  matrices  (concentration 
versus  response)  to  use  as  the  predictor  variables.  In  this  study,  the  concentration  vectors  were 
used  in  this  role  because  the  errors  in  the  responses  are  presumed  larger  than  those  in  the 
concentrations  (see  equation  3  where  the  projection  pursuit  model  assumes  the  errors  are  primarily 
in  rj)  and  because  of  an  interest  in  estimating  the  coefficients  for  the  columns  of  C  to  compare  to 
the  earlier  study. 




When using this approach, projection pursuit can be used to estimate the model for one ion
selective electrode at a time. The resulting projection pursuit model for the jth sensor is as follows,

r_j = G_j\Big( \sum_{k=1}^{K} \alpha_{kj}\, c_k \Big) + \varepsilon_j    (3)

where r_j is an I×1 vector corresponding to the response of the jth sensor to the I calibration samples;
α_kj is the coefficient for the kth predictor variable (c_k), where Σ_k α_kj² = 1; G_j represents a smooth of
the linear combination described by the α quantities (an explanation of this follows); and ε_j is the
error associated with fitting the data for the jth sensor.

For each of the J sensors, the goal of projection pursuit is to determine the α quantities and a
smooth that minimize the associated error (ε_j). It achieves this by first choosing trial α
quantities, calculating a smooth given that linear combination of predictor variables, and calculating
the error. It then iteratively searches for the linear combination that minimizes the error. The
optimal linear combination is the one that yields the narrowest band of points when plotted
against r_j. A simple hypothetical problem with simple graphics will make this clearer.

For illustration purposes, assume that an analyst has obtained a calibration set of
the responses of one ISE to I mixtures of two analytes. Let r be the I×1 vector of responses of the
ISE to the I samples and C be the I×2 matrix of concentrations. Again for illustration purposes,
assume that the first projection pursuit iteration yielded 0.37 c_1 + 0.93 c_2 as the initial linear
combination of predictor variables to describe r. From equation 3 it is clear that α_1 = 0.37 and α_2
= 0.93. Projection pursuit next determines whether or not this choice of α quantities is reasonable
by  "viewing"  the  relationship  between  r  and  the  chosen  linear  combination  of  predictor  variables 
(Fig.  1).  (In  practice,  the  method  "views"  the  result  mathematically,  and  plots  are  included  here 
only  for  instructional  purposes.) 

To determine the acceptability of the chosen α quantities, projection pursuit determines a
smooth (7) going through the points in figure 1. A smooth is a continuous line that describes the
general behavior of all of the points. The value of the smooth at any point is based on the local
average of q points on either side, where q defines the bandwidth of the smooth. The smooth in
figure 2 was obtained using running averages where q=5 followed by a polynomial smooth of
degree 3. The important point of this illustration is to demonstrate how a smooth describes the
general trend of the data. The degree to which the smooth describes each point individually (as
opposed to the overall trend) can be determined by adjusting the value of q. Although smooths are
not  common  in  calibration  procedures,  some  common  methods  of  analysis  are  very  similar  in  their 
results.  For  example,  the  ordinary  regression  line  can  be  thought  of  as  a  type  of  smooth  describing 
the  data  with  the  restriction  of  linearity.  Another  commonly  employed  procedure  is  the  use  of 
spline  functions  (8)  to  approximate  data  when  derivative  spectra  are  desired.  The  spline  function 
breaks  the  data  into  sections  and  fits  polynomials  to  each  section  to  form  a  continuous  curve. 
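
The running-average idea can be sketched in a few lines. This is only an illustration of a local average over q points on either side; it omits the subsequent polynomial smooth mentioned above, it is not the smooth implemented in the projection pursuit program, and truncating the window at the ends of the data is an assumption made here.

```python
import numpy as np

def running_mean_smooth(x, y, q=5):
    """Smooth y against x with a local average over q points on either side."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)             # smooth along increasing x
    y_sorted = y[order]
    n = len(y_sorted)
    smooth_sorted = np.empty(n)
    for i in range(n):
        lo = max(0, i - q)            # window is truncated at the ends
        hi = min(n, i + q + 1)
        smooth_sorted[i] = y_sorted[lo:hi].mean()
    smooth = np.empty(n)
    smooth[order] = smooth_sorted     # return values in the original sample order
    return smooth
```

A larger q gives a flatter curve that follows only the overall trend, while q = 0 reproduces the data points themselves.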

The smooth used by projection pursuit is more like the spline function than the
regression line in that it is not based on any model criterion such as linearity. The smooth is based
solely on the behavior of the data and does not have any functional form. However, in situations
where there is a real functional relationship (e.g., log, exponential) between the variables being
smoothed,  the  data  should  reflect  the  relationship  and  the  smooth  should  closely  approximate  the 
true  model.  For  more  details  concerning  the  smooth  employed  by  projection  pursuit  see 
reference  6. 

Returning to the illustration, once a linear combination of predictor variables is chosen,
projection pursuit calculates a smooth. The deviation of the points from the smooth is then used to
determine whether the linear combination of predictor variables (α) is acceptable or whether the
procedure should continue iterating. If the fit is not acceptable, a non-linear regression technique
(based on the method of Rosenbrock (9)) is used to choose a new set of α quantities to examine.
A new smooth and corresponding fit are calculated, and the process is repeated until the deviation
from the smooth converges. In the present hypothetical example it will be assumed that the α
quantities determined in the first iteration (α = [0.37 0.93]) were not the best possible and that
the procedure continued searching for new α quantities until the following model was determined.

r = G(0.85 c_1 + 0.53 c_2)    (4)

Figure  3  is  a  plot  of  r  versus  this  linear  combination  of  c  vectors  where  the  desired  fit  of  the  data  to 
the  smooth  is  achieved.  Projection  pursuit  has  found  a  linear  combination  of  predictor  variables 
that  results  in  a  narrow  band  of  points  when  r  is  plotted  versus  that  linear  combination. 
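
The search for the narrowest band of points can be sketched as follows for the two-predictor case. Several simplifications are assumed here and are not part of the original procedure: a grid search over the angle of a unit-length α replaces the Rosenbrock-based search, a plain running average stands in for the program's smooth, both coefficients are taken to be non-negative, and the data are synthetic.

```python
import numpy as np

def smooth(z, r, q=2):
    """Running-average smooth of r along increasing z (q points either side)."""
    order = np.argsort(z)
    rs = r[order]
    out = np.empty(len(r))
    for i in range(len(rs)):
        out[order[i]] = rs[max(0, i - q): i + q + 1].mean()
    return out

def best_direction(C, r, n_angles=181, q=2):
    """Grid search for alpha = [cos t, sin t] with the least scatter about the smooth."""
    best = None
    for t in np.linspace(0.0, np.pi / 2, n_angles):
        alpha = np.array([np.cos(t), np.sin(t)])   # unit-length coefficients
        z = C @ alpha                              # projection of the samples
        rss = np.sum((r - smooth(z, r, q)) ** 2)   # deviation from the smooth
        if best is None or rss < best[1]:
            best = (alpha, rss)
    return best

# Synthetic, noise-free data whose response depends on log(0.85 c1 + 0.53 c2).
rng = np.random.default_rng(0)
C = rng.uniform(1.0, 10.0, size=(200, 2))
r = np.log10(C @ np.array([0.85, 0.53]))
alpha, rss = best_direction(C, r)
```

On data of this kind the selected α should land close to the generating direction, because any other projection spreads the points into a wider band about the smooth.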

Once  the  calibration  model  has  been  derived,  prediction  can  be  accomplished  using  two 
different  schemes.  The  first  is  to  examine  the  derived  smooth  and  determine  whether  it 
corresponds  to  a  known  function.  If  an  adequate  function  is  available,  it  can  be  used  in  place  of 
the  smooth  and  the  resulting  model  can  be  used  for  prediction.  This  was  the  approach  taken  by  the 
present  work  to  calibrate  the  array  of  ISEs  because  of  the  good  correspondence  between  the 
smooth  and  the  log  function,  the  ease  of  implementation  of  this  method,  and  the  unavailability  of 
the  true  smooth  function  as  derived  by  the  projection  pursuit  program. 

The  second  approach  to  prediction  is  where  projection  pursuit  is  the  most  versatile.  Instead 
of  replacing  the  smooth  with  a  known  function,  prediction  can  be  performed  using  the  smooth 
itself.  This  is  possible  because  the  smooth  function  is  continuous  within  the  range  defined  by  the 
calibration  samples  and  therefore  interpolation  can  be  performed.  To  illustrate  how  the  smooth  can 
be  used  for  prediction,  one  can  imagine  estimating  a  calibration  model  where  the  instrument 
responses  at  three  wavelengths  are  used  as  predictor  variables  and  the  concentration  of  an  analyte  is 
to  be  estimated.  A  possible  model  may  be, 

c = G(0.21 r_1 + 0.36 r_2 + 0.91 r_3)    (5)

where G and α = [0.21 0.36 0.91] correspond to the final model. In this example note that the
roles  of  the  response  and  concentration  vectors  are  not  the  same  as  in  the  calibration  of  ISEs. 
Here,  the  response  vectors  are  treated  as  predictor  variables  while  the  concentration  vector  is 
treated as a dependent variable. These are not equivalent approaches, and the choice of which
method to use depends on the error structure and the goals of the experiment.

Once the calibration model has been determined (eq. 5), the analyst can predict the
concentration of the analyte in an unknown sample (e.g., with response r_un = [1 2 3]) by evaluating
the estimated smooth at the point (1)(0.21) + (2)(0.36) + (3)(0.91) = 3.66.
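
A minimal sketch of this prediction step is given below. The coefficients and the unknown response come from the example above; the tabulated smooth (z_cal, g_cal) is entirely hypothetical, and linear interpolation between tabulated points is an assumption about how the smooth would be evaluated.

```python
import numpy as np

alpha = np.array([0.21, 0.36, 0.91])   # coefficients from equation 5
r_un = np.array([1.0, 2.0, 3.0])       # response of the unknown sample
z_un = float(alpha @ r_un)             # 0.21 + 0.72 + 2.73 = 3.66

# Hypothetical tabulated smooth: sorted projections of the calibration
# samples and the value of the smooth at each of them.
z_cal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
g_cal = np.array([0.10, 0.35, 0.55, 0.70, 0.80])

def predict_from_smooth(z, z_cal, g_cal):
    """Linear interpolation on the smooth; refuses to extrapolate."""
    if z < z_cal[0] or z > z_cal[-1]:
        raise ValueError("projection lies outside the calibration range")
    return float(np.interp(z, z_cal, g_cal))

c_hat = predict_from_smooth(z_un, z_cal, g_cal)
```

The range check reflects the fact that the smooth is defined only within the span of the calibration samples.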

The advantage of using the estimated smooth in this manner is that departures from
ideality  are  modelled,  and  representative  models  for  many  systems  that  do  not  follow  "common" 
functional  forms  can  be  obtained.  This  is  because  a  smooth  is  not  constrained  to  follow  any 
predetermined  functionality  and  is  therefore  more  able  to  model  the  behavior  of  the  data.  This 
characteristic  can  make  the  method  very  valuable  in  an  exploratory  sense  where  unknown 
relationships  between  variables  are  sought.  One  caveat  to  the  use  of  this  method  is  that  it  is 
possible  to  overfit  the  data.  Although  the  smooth  used  by  the  projection  pursuit  is  robust  and 
therefore  not  overly  sensitive  to  outliers  (non-representative  calibration  samples),  it  is  still  capable 
of  fitting  noise  as  well  as  data.  Another  limitation  is  that  the  method  is  not  useful  in  situations 
where  the  unknown  sample  lies  outside  of  the  range  of  responses  and  concentrations  defined  by 
the  calibration  set.  When  a  smooth  is  used  as  an  integral  part  of  the  model,  extrapolation  is  not 
possible.  This  is  because  smooths  are  a  product  of  the  data  itself  and  therefore  cannot  be  used  to 
infer  about  behavior  beyond  the  data.  It  should  be  noted,  however,  that  both  of  these  limitations 
are also present in cases where the model is known, and the analyst should be aware of the possible
complications  regardless  of  the  method  employed. 

Calibration  The  functionality  of  the  response  of  an  ISE  to  mixtures  of  analytes  (eq.  2) 
makes  it  very  amenable  to  analysis  using  the  projection  pursuit  algorithm.  The  data  used  in  this 
study  were  taken  from  the  third  experiment  in  the  study  performed  in  reference  3.  The  data 
consisted  of  the  responses  of  five  ISEs  to  mixtures  of  sodium  (0.1200  -  0.1650  M)  and  potassium 
(2.00  -  8.40  mM)  ions  in  aqueous  solutions  (See  table  1  for  concentration  levels  of  the  11 
calibration and 9 prediction samples). An 11×5 matrix R was constructed with rows
corresponding to samples and columns corresponding to sensors (see table 2). Likewise, an 11×2
matrix C was formed with columns corresponding to sodium and potassium ion concentrations,
respectively. To build the model for the jth electrode, projection pursuit was used to find α_1j and
α_2j as in equation 3, where r_j is the response of the jth sensor to the 11 calibration samples and c_1
and c_2 are the concentrations of sodium and potassium in these samples, respectively.

For  this  study,  the  first  approach  to  prediction  was  taken  where  the  smooth  was  replaced 
with  a  functional  form.  As  was  expected,  the  log  function  corresponded  well  with  the  derived 
smooth.  To  compare  the  model  results  obtained  using  projection  pursuit  to  those  obtained  in  the 
earlier  study  using  the  same  data,  the  following  steps  were  followed  to  obtain  projection  pursuit 
estimates  of  E°  and  S  for  each  of  the  sensors. 

The first step is to assume the final smooth corresponds to the log function for each sensor.
This is suggested by the data as will be discussed later. The α quantities and the log function can
then be used to transform the predictor variables (c_1 and c_2) to yield a new vector l_j that is linearly related
to r_j.

l_j = \log( \alpha_{1j} c_1 + \alpha_{2j} c_2 )    (6)

Regressing r_j onto l_j yields the following equation.

r_j = \beta_{0j} + \beta_{1j} \log( \alpha_{1j} c_1 + \alpha_{2j} c_2 )    (7)

To find the corresponding estimates of E_j° and S_j, the following rearrangements of equation 7 can
be made.

r_j = \beta_{0j} + \beta_{1j} \log\{ \alpha_{1j} [ c_1 + (\alpha_{2j}/\alpha_{1j}) c_2 ] \}    (8)

    = [ \beta_{0j} + \beta_{1j} \log(\alpha_{1j}) ] + \beta_{1j} \log[ c_1 + (\alpha_{2j}/\alpha_{1j}) c_2 ]    (9)

Comparing this equation to equation 2 yields the following equalities.

K_j = \alpha_{2j} / \alpha_{1j}    (10)

S_j = \beta_{1j}    (11)

E_j^{\circ} = \beta_{0j} + \beta_{1j} \log(\alpha_{1j})    (12)

Note  that  equations  6-12,  and  the  prediction  equations  that  follow,  are  based  on  the  assumption 
that  the  smooth  determined  by  projection  pursuit  is  equal  to  the  log  function.  As  stated  earlier,  in 
situations  where  no  functional  form  can  be  found  to  represent  the  smooth,  the  calibration  and 
prediction  models  can  be  constructed  using  the  smooth  function  itself.  This  latter  approach  will  not 
be  employed  in  the  present  work  but  can  prove  to  be  a  powerful  alternative  to  the  more  traditional 
approach. 
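
The conversion in equations 10-12 can be sketched as below, using the projection pursuit coefficients of Table III and the regression coefficients of Table IV. Base-10 logarithms are assumed, and the function and variable names are hypothetical.

```python
import math

# Coefficient pairs from Table III and regression coefficients from Table IV.
alpha = {"TNO": (0.919, 0.395), "GP": (0.152, 0.988), "EHPP": (0.403, 0.915),
         "NA": (0.310, 0.951), "METH": (0.819, 0.574)}
beta = {"TNO": (46.40, 51.63), "GP": (35.32, 49.70), "EHPP": (63.18, 44.81),
        "NA": (91.34, 55.18), "METH": (46.38, 52.72)}

def nernst_parameters(a1, a2, b0, b1):
    """Apply equations 10-12; assumes base-10 logarithms."""
    k_sel = a2 / a1                   # equation 10
    slope = b1                        # equation 11
    e0 = b0 + b1 * math.log10(a1)     # equation 12
    return e0, slope, k_sel

# One (intercept, slope, selectivity) triple per sensor.
params = {name: nernst_parameters(*alpha[name], *beta[name]) for name in alpha}
```

Because the rearrangement is purely algebraic, the converted parameters reproduce the regression model exactly for any pair of concentrations.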


Prediction  Once  the  calibration  model  has  been  estimated  for  each  of  the  five  electrodes, 
prediction  of  analyte  concentrations  for  unknown  samples  can  be  accomplished  by  rearranging 
equation  7  and  employing  MLR.  The  first  step  is  to  write  equation  7  in  the  following  form. 


10^{(r_j - \beta_{0j}) / \beta_{1j}} = \alpha_{1j} c_1 + \alpha_{2j} c_2    (13)

In equation 13, r_j, c_1, and c_2 are in plain text because this equation is for one unknown sample as
opposed to the general case of I samples in equation 7. For each unknown sample the value of r_j is
measured, and β_0j and β_1j have been estimated, so the left-hand side of the equation is a known
quantity. Therefore, for each sensor, one equation of the following form is found for the
unknown sample,

x_j = \alpha_{1j} c_1 + \alpha_{2j} c_2    (14)


where x_j is the quantity on the left-hand side of equation 13. This can be rewritten in vector
notation as follows.

x_j = \alpha_j^{T} c    (15)

If the x values calculated for each sensor (for the same unknown sample) are stacked to form a
column vector (x), and the corresponding α_j^T for each sensor are stacked as rows to form a matrix A,
the following equality will hold,

x = A c + e    (16)

where c is a vector of concentrations for the unknown sample and e is the vector of errors. This
equation can be solved using MLR to yield estimates for the elements of c in the following manner,

\hat{c} = (A^{T} A)^{-1} A^{T} x    (17)

By following these steps, the calibration model can be used to predict the concentrations of
analytes in unknown samples.
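
The prediction step can be sketched as follows, with A built from Table III and the regression coefficients taken from Table IV. Base-10 logarithms are assumed, the least-squares solver is used in place of forming the inverse in equation 17 explicitly, and the sample and its concentration units are hypothetical; the round trip only checks the algebra and is not a reproduction of Table VI.

```python
import numpy as np

# Rows of A are the coefficient vectors of the five sensors (Table III).
A = np.array([[0.919, 0.395], [0.152, 0.988], [0.403, 0.915],
              [0.310, 0.951], [0.819, 0.574]])
# Regression coefficients of each sensor (Table IV).
b0 = np.array([46.40, 35.32, 63.18, 91.34, 46.38])
b1 = np.array([51.63, 49.70, 44.81, 55.18, 52.72])

def predict_concentrations(r):
    """Estimate c for one sample from its five sensor responses."""
    x = 10.0 ** ((r - b0) / b1)                     # equation 13, base-10 log assumed
    c_hat, *_ = np.linalg.lstsq(A, x, rcond=None)   # least-squares form of equation 17
    return c_hat

# Round-trip check on a hypothetical, noise-free sample.
c_true = np.array([0.15, 5.0])
r_sim = b0 + b1 * np.log10(A @ c_true)              # responses implied by equation 7
c_hat = predict_concentrations(r_sim)
```

With five sensors and two analytes the system is overdetermined, so measurement noise in the individual sensors is averaged in the least-squares solution.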

Results  and  Discussion 


Five electrodes were used to generate the data analyzed in this study: a Corning 476220
general purpose cation glass electrode, an Orion 94-11 sodium glass electrode with uncharacteristic
non-selective behavior (3), and three plastic membrane based electrodes whose construction will
not be discussed here. The sensors will be referred to as TNO, GP, EHPP, NA, and METH for
sensors 1-5, respectively. For a more complete description of the electrodes, instrumentation, and
data acquisition the reader is referred to reference 3.

The first step was to use projection pursuit to determine the values of the α quantities in
equation 3. For this study, 11 calibration samples were used to derive the parameters and the
resulting models were used to predict the concentrations of analytes contained in an additional 9 test
samples. (It should be noted that projection pursuit is generally used in situations where the
dimensionality of the problem and the number of samples used in the calibration step (I) are much
larger than in this example. The ability of projection pursuit to successfully estimate the model
parameters in the present study is due to the low level of noise present.) The resulting α quantities
for  the  models  of  the  five  sensors  are  listed  in  table  3.  Since  the  procedure  for  model  estimation 
was  identical  for  each  of  the  five  sensors,  the  details  of  only  the  first  sensor  will  be  discussed. 

Assuming the functional form for sensor 1 is not known, the next step would be to plot r_1
versus the "best" linear combination of the c vectors (0.919 c_1 + 0.395 c_2). This plot reveals that
projection pursuit has done a good job of forming a tight band of points. A close examination of
the plot also suggests that the true function describing the data has a slight curvature (negative
second derivative). If a straight line is fit to these points and the residuals examined (fig. 4), it
becomes clear that there is structure in the data that is not accounted for using a simple regression
line on the untransformed data. The shape of the residual structure suggests that a suitable
transformation is to descend the ladder of powers for (0.919 c_1 + 0.395 c_2), and the log function is a
reasonable choice. In this study, there was obviously a strong bias toward using the log function
as  this  is  suggested  by  theory.  In  other  situations  the  analyst  may  not  have  any  a  priori  knowledge 
about  the  relationship  between  dependent  and  independent  variables.  In  those  situations, 
projection  pursuit  may  provide  clues  concerning  the  functional  relationships.  From  this 
information,  hypotheses  can  be  formulated  and  further  investigation  performed  to  verify  or 
disprove  the  hypotheses. 

For ISEs, the log function was suspected and the data substantiated the form of equation 2.
A plot of log(0.919 c_1 + 0.395 c_2) versus r_1 revealed a more linear relationship. Figure 5 is the
residual plot of the regression of r_1 on the transformed linear combination of c vectors, where the
lack of structure in the new residuals verifies the log transformation as being acceptable.
Additionally, the log function can be shown to be appropriate by comparing the magnitude of the
residuals (taking into account that the log transformation has been performed). (Note: The original
experiment was set up using a factorial design (10) to choose the levels for the concentration of
analytes. For this reason, four sets of points in figures 4 and 5 are clustered together and are not
ideal for determining the optimal transformation. If the experiment were to be repeated, it would
be advisable to use a different scheme to select analyte concentration levels for the calibration
samples.)
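
The residual comparison can be illustrated with a small synthetic example. The data below are hypothetical and noise-free (a log response with coefficients of roughly the size found for the first sensor); they are not the data of figures 4 and 5.

```python
import numpy as np

z = np.linspace(1.0, 10.0, 25)             # hypothetical values of the linear combination
r = 46.40 + 51.63 * np.log10(z)            # noise-free log response, illustrative only

def fit_residuals(x, y):
    """Residuals from an ordinary least-squares straight line of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

res_linear = fit_residuals(z, r)            # straight line on untransformed data
res_log = fit_residuals(np.log10(z), r)     # straight line after the log transformation

rss_linear = float(np.sum(res_linear ** 2))
rss_log = float(np.sum(res_log ** 2))
```

For a response with negative second derivative, the straight-line residuals are negative at both ends and positive in the middle; after the log transformation that structure disappears.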

The  next  step  of  the  calibration  procedure  is  the  regression  of  the  vector  of  responses  onto 
the vector l_j (eq. 7). This regression step yielded the coefficients found in table 4. These values
can be used to estimate the parameters in the original Nernst equation (eq. 2) using the equalities
found in equations 10-12. The estimated parameters using projection pursuit and those of the
earlier study of Beebe et al. are listed in table 5. The agreement between the two methods is very
good and one would expect the predictive abilities of the two methods to be comparable. Note that
it is possible to improve the projection pursuit model once the log function has been chosen to replace
the  smooth.  One  approach  to  calibration  is  to  use  projection  pursuit  to  find  the  functional 
relationships  and  some  other  non-linear  regression  technique  to  determine  the  model  parameters 
given  the  estimated  functional  form  for  the  calibration  model. 

Table  6  lists  the  results  of  using  the  projection  pursuit  model  to  predict  the  concentrations 
of  analytes  in  the  nine  test  samples  using  equation  17.  The  results  of  the  earlier  study  using  the 
simplex  model  are  also  included  for  comparison.  These  results  show  that  the  model  derived  using 
the  simplex  method  yielded  slightly  better  prediction  than  the  model  determined  using  projection 
pursuit.  This  is  a  reasonable  result  because  of  the  approach  to  prediction  that  was  followed.  The 
projection pursuit smooth was assumed to equal the log transformation and the α quantities
calculated for the smooth were used with the log function. These α quantities were not exactly
optimal  for  the  log  function  and  therefore  the  projection  pursuit  model  did  not  perform  as  well  as 
the  model  derived  using  the  simplex  procedure. 


As stated earlier, a second approach would have been to use the smooth itself as a part of
the  calibration  model  (eq.  5).  Both  of  these  are  reasonable  options  and  either  could  be  used 
depending  on  whether  the  analyst  has  more  confidence  in  the  theory  (in  which  case  the  functional 
model  would  be  desirable)  or  the  calibration  samples  (where  the  smooth  would  be  used). 

Conclusion 

Projection  pursuit  has  been  presented  along  with  an  example  of  its  use.  Although  ion 
selective  electrodes  were  used  in  this  study,  it  is  not  to  be  inferred  that  this  is  the  method  of  choice 
for  the  calibration  of  ISEs.  ISE  data  were  used  to  illustrate  the  effectiveness  and  capabilities  of 
projection  pursuit  for  calibration  and  model  estimation  in  general.  Many  other  possible 
applications  can  be  imagined  and  a  variety  of  approaches  to  the  data  analysis  can  be  used 
depending  on  the  particular  type  of  data  at  hand.  The  most  powerful  aspect  of  the  technique  is  that 
there  is  no  need  to  assume  any  functional  relationship  between  variables  under  investigation.  The 
method  can  be  used  to  verify  assumed  relationships,  detect  outliers,  and  determine  functional 
relationships  between  variables  that  are  unknown.  Furthermore,  the  method  is  not  tied  to  any  fixed 
functional  form.  If  the  data  do  not  follow  some  standard  form,  the  more  common  modes  of 
analysis cannot be used for model building. Using the smooths as the functional forms allows the
analyst  to  build  models  that  are  uniquely  characteristic  of  the  system  under  investigation. 

Figure Captions:


Figure 1. For the hypothetical example, plot of r versus 0.37 c_1 + 0.93 c_2.

Figure 2. Plot of r versus 0.37 c_1 + 0.93 c_2 with a smooth included. This plot shows the
relatively poor fit of the data to the initial smooth estimate.

Figure 3. Plot of r versus 0.85 c_1 + 0.53 c_2 for the hypothetical final model where the desired
tight belt of points is achieved.

Figure 4. Plot of residuals versus fitted values resulting from the regression of r_1 onto
(0.919 c_1 + 0.395 c_2).

Figure 5. Plot of residuals versus fitted values resulting from the regression of r_1 onto
log(0.919 c_1 + 0.395 c_2).

Table II. Responses of sensors to calibration and prediction
samples.

Calibration sample    Sensor:  TNO    GP    EHPP    NA    METH


Table III. Results of projection pursuit of r_j and C.

Sensor    α_1      α_2
TNO       0.919    0.395
GP        0.152    0.988
EHPP      0.403    0.915
NA        0.310    0.951
METH      0.819    0.574

Table IV. Results of regression step of calibration phase.

Sensor    β_0      β_1
TNO       46.40    51.63
GP        35.32    49.70
EHPP      63.18    44.81
NA        91.34    55.18
METH      46.38    52.72


Table VI. Predicted concentrations and (% relative error) of prediction for test samples.

          Projection pursuit               Simplex
Sample    [Na+] M         [K+] mM          [Na+] M         [K+] mM
1         0.1202 (0.2)    3.50 (8.4)       0.1198 (0.2)    3.73 (2.4)
2         0.1999 (0.1)    6.92 (1.1)       0.1199 (0.1)    6.86 (2.0)
3         0.1342 (0.6)    4.26 (11.5)      0.1351 (0.1)    3.89 (1.8)
4         0.1341 (0.7)    7.53 (7.6)       0.1349 (0.1)    7.15 (2.1)
5         0.1493 (0.5)    2.13 (6.5)       0.1498 (0.1)    2.03 (1.5)
6         0.1498 (0.1)    6.99 (0.1)       0.1497 (0.2)    7.00 (0.0)
7         0.1486 (0.3)    8.48 (1.0)       0.1486 (0.9)    8.45 (0.6)
8         0.1668 (1.1)    3.50 (8.4)       0.1665 (0.9)    3.69 (3.4)
9         0.1645 (0.3)    6.79 (3.0)       0.1641 (0.5)    7.03 (0.4)

% relative error (0.4)